Attribution Map





Deep Model Transferability from Attribution Maps

Jie Song, Yixin Chen, Xinchao Wang, Chengchao Shen, Mingli Song

Neural Information Processing Systems

Unlike the seminal work of taskonomy, which relies on a large number of annotations as supervision and is thus computationally cumbersome, the proposed approach requires no human annotations and imposes no constraints on the architectures of the networks.
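The idea above can be illustrated with a minimal sketch (not the paper's exact algorithm): compute attribution maps for two pre-trained models on the same unlabeled probe inputs, then score their affinity by average attribution similarity. The finite-difference attribution, cosine similarity, and toy models here are all illustrative assumptions.

```python
import math

def attribution(model, x, eps=1e-4):
    """Finite-difference gradient of the model output w.r.t. each input dim."""
    grads = []
    for i in range(len(x)):
        hi = list(x); hi[i] += eps
        lo = list(x); lo[i] -= eps
        grads.append((model(hi) - model(lo)) / (2 * eps))
    return grads

def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0

def transferability_score(model_a, model_b, probes):
    """Average attribution-map similarity over unlabeled probe inputs."""
    sims = [cosine(attribution(model_a, x), attribution(model_b, x))
            for x in probes]
    return sum(sims) / len(sims)

# Two toy "models" that weight features in the same direction: since
# their attribution maps align, the affinity score should be near 1.0.
model_a = lambda x: 2.0 * x[0] + 1.0 * x[1]
model_b = lambda x: 4.0 * x[0] + 2.0 * x[1]
probes = [[0.3, 0.7], [1.0, -0.5], [-0.2, 0.4]]
score = transferability_score(model_a, model_b, probes)
```

Note that no labels enter the computation: only probe inputs and the models' gradients, which is what lets the approach skip human annotation entirely.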


On the notion of missingness for path attribution explainability methods in medical settings: Guiding the selection of medically meaningful baselines

Geiger, Alexander, Wagner, Lars, Rueckert, Daniel, Wilhelm, Dirk, Jell, Alissa

arXiv.org Artificial Intelligence

The explainability of deep learning models remains a significant challenge, particularly in the medical domain, where interpretable outputs are critical for clinical trust and transparency. Path attribution methods such as Integrated Gradients rely on a baseline representing the absence of relevant features ("missingness"). Commonly used baselines, such as all-zero inputs, are often semantically meaningless, especially in medical contexts. While alternative baseline choices have been explored, existing methods lack a principled approach to dynamically select baselines tailored to each input. In this work, we examine the notion of missingness in the medical context, analyze its implications for baseline selection, and introduce a counterfactual-guided approach to address the limitations of conventional baselines. We argue that a generated counterfactual (i.e., a clinically "normal" variation of the pathological input) represents a more accurate embodiment of a meaningful absence of features. We use a Variational Autoencoder in our implementation, though our concept is model-agnostic and can be applied with any suitable counterfactual method. We evaluate our concept on three distinct medical data sets and empirically demonstrate that counterfactual baselines yield more faithful and medically relevant attributions, outperforming standard baseline choices as well as other related methods.
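A minimal sketch of Integrated Gradients with a pluggable baseline shows why the baseline choice matters. The toy gradient, step count, and "counterfactual" baseline below are illustrative assumptions; in the setting described above, the baseline would be a generated clinically "normal" counterfactual of the input rather than an all-zero vector.

```python
def integrated_gradients(model_grad, x, baseline, steps=50):
    """Approximate IG: (x - baseline) times the mean gradient along the
    straight-line path from the baseline to the input."""
    attrs = [0.0] * len(x)
    for k in range(1, steps + 1):
        alpha = k / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = model_grad(point)
        for i in range(len(x)):
            attrs[i] += g[i]
    return [(xi - b) * a / steps for xi, b, a in zip(x, baseline, attrs)]

# Toy linear model f(x) = 3*x0 + 1*x1; its gradient is constant [3, 1].
grad = lambda p: [3.0, 1.0]
x = [2.0, 4.0]

zero_baseline = [0.0, 0.0]   # conventional all-zero baseline
counterfactual = [1.0, 4.0]  # hypothetical "normal" variant of x

ig_zero = integrated_gradients(grad, x, zero_baseline)  # attributes both dims
ig_cf = integrated_gradients(grad, x, counterfactual)   # attributes only the
                                                        # dim that differs
```

With the zero baseline, both features receive attribution; with the counterfactual baseline, only the feature that actually deviates from "normal" does, while the completeness property (attributions summing to f(x) minus f(baseline)) holds in both cases.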




Smoothed Geometry for Robust Attribution

Neural Information Processing Systems

Building on a geometric understanding of attribution attacks presented in recent work, we identify Lipschitz continuity conditions on models' gradients that lead to robust gradient-based attributions, and observe that the smoothness of the model's decision surface is related to the transferability of attacks across multiple attribution methods.
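One way to see the role of smoothness is a SmoothGrad-style sketch (an assumption here, not this paper's exact construction): averaging gradients over noisy copies of the input yields an effective gradient field that varies more smoothly, so a sharp kink in the raw gradient no longer dominates the attribution. The toy model and noise scale are illustrative.

```python
import random

def smoothed_attribution(model_grad, x, sigma=0.1, n=200, seed=0):
    """Average the raw gradient over Gaussian perturbations of the input."""
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        g = model_grad(noisy)
        acc = [a + gi for a, gi in zip(acc, g)]
    return [a / n for a in acc]

# Toy model with a kink: f(x) = |x0| + x1. Its raw gradient flips from
# -1 to +1 abruptly at x0 = 0, exactly the kind of non-smooth geometry
# an attribution attack can exploit with a tiny input perturbation.
def grad(p):
    return [1.0 if p[0] >= 0 else -1.0, 1.0]

# Near the kink, the raw gradient is a brittle +/-1; the smoothed
# attribution interpolates between them instead of flipping sign.
attr = smoothed_attribution(grad, [0.05, 1.0])
```

The smooth dimension (x1) keeps its exact gradient of 1.0, while the kinked dimension receives an intermediate value, mirroring the intuition that Lipschitz-smoother gradients are harder to manipulate.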



Supplementary Material: Attribution Preservation in Network Compression for Reliable Network Interpretation

Neural Information Processing Systems

ImageNet class labels: the class labels are unusable. In the fine-tuning phase, the pruned network is fine-tuned for 10 epochs with a batch size of 180. We conduct experiments with structured pruning methods on ImageNet and observe the same tendencies in the results (Table 4): our method outperforms naive compression in terms of maintaining the attribution maps.